ECOSMOG: An Efficient Code for Simulating Modified Gravity 
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We introduce a new code, ECOSMOG, to run A*'-body simulations for a wide class of modified gravity 
and dynamical dark energy theories. These theories generally have one or more new dynamical 
degrees of freedom, the dynamics of which are governed by their (usually rather nonlinear) equations 
of motion. Solving these non-linear equations has been a great challenge in cosmology. Our code is 
based on the RAMSES code, which solves the Poisson equation on adaptively refined meshes to gain 
high resolutions in the high-density regions. We have added a solver for the extra degree(s) of freedom 
and performed numerous tests for the f{R) gravity model as an example to show its reliability. We 
find that much higher efficiency could be achieved compared with other existing mesh/grid-based 
codes thanks to two new features of the present code: (1) the efficient parallelisation and (2) the 
usage of the multigrid relaxation to solve the extra equation(s) on both the regular domain grid and 
refinements, giving much faster convergence even under much more stringent convergence criteria. 
This code is designed for performing high- accuracy, high-resolution and large-volume cosmological 
simulations for modified gravity and general dark energy theories, which can be utilised to test 
gravity and the dark energy hypothesis using the upcoming and future deep and high-resolution 
galaxy surveys. 



I. INTRODUCTION 

The accelerating expansion of our Universe is one of the 
most challenging questions in modern physics [1]. After 
more than a decade of attempts in constructing consis- 
tent and natural theories for it, we are still nowhere near 
a definite conclusion. Broadly speaking, the theoretical 
models developed so far roughly fall into two categories: 
those involving some exotic matter species, the so-called 
dark energy, which usually have nontrivial dynamics, and 
those involving modifications to the Einstein gravity on 
certain (usually large) scales. Examples of the dynam- 
ical dark energy include the quintessence [2], k-essence 
ams2000, coupled quintessence [4], chameleon field [5, 6], 
symmetron field [7] etc., and some examples of modified 
gravity models are the f{R) gravity [8], scalar-tensor the- 
ory [9] and DGP model [10] (for some recent review see, 
e.g., [11-13]). 

From a practical point of view (i.e., regardless of the 
naturalness or fine-tuning considerations), there are two 
important issues faced by any theoretical model: consis- 
tency and degeneracy. 

In order to ensure the consistency of the mode, a given 
model should not violate any existing observational con- 
straints. The new degrees of freedom, which are supposed 
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to drive the accelerating expansion only on very large 
scales, quite often produce unwanted side effects. Con- 
sider the coupled quintessence as an example; if it couples 
to normal matter species (baryons etc.), it can mediate 
a fifth force that is strongly constrained by experiments. 
This problem can certainly be avoided by assuming that 
the quintessence field only couples to dark matter which 
we cannot directly observe, but more interesting theoreti- 
cal mechanisms to avoid this problem have been designed 
in recent years. The leading example is the chameleon 
mechanism [5, 6, 14]. By this mechanism, the fifth force is 
suppressed to the undetectable level in regions with high 
matter density, while in low-density environments such as 
galaxies, galaxy clusters and cosmological voids, the fifth 
force is unsuppressed and can be as strong as gravity. The 
rapid changes of the behaviour of the fifth force across 
different regions naturally imply that the equation(s) of 
motion governing the dynamics of the new degree(s) of 
freedom should be very nonlinear. This adds the com- 
plexity to the model but we cannot avoid this to ensure 
the consistency of the model. 

Different theoretical models can behave very similarly 
to each other and to the standard ACDM paradigm, 
which makes it difficult to distinguish amongst them us- 
ing observational data. The (apparent) existence of the 
degeneracy usually indicates the incompleteness of our 
theoretical understanding of models. For instance, it is 
very easy to design a scalar field model which produces 
almost identical cosmic expansion history as ACDM, but 
this is merely because in the background cosmology, any 
detailed structure of the Universe is disregarded; when we 



2 



look at the linear perturbation evolutions, the apparent 
degeneracy is often broken. However, there are situations 
where the degeneracy cannot be broken even with linear 
perturbation analysis (such as the chameleon theory and 
f{R) gravity in which the length scales on which linear 
perturbation analyses apply are bigger than the range of, 
and so not affected by, the fifth force). In such situations, 
a full study of the nonlinear structure formation and evo- 
lution needs to be performed to break the degeneracy. 

Therefore, it is essential to study non-linea structure 
formation to ensure the consistency of the model and 
break the degeneracies between various models. Non- 
linear structure formation is too complex to understand 
analytically and therefore requires numerical simulations. 
Numerical simulations of the cosmic structure evolution 
on small (such as galactic or cluster) scales can not only 
open a new arena of using consistency with observations 
to constrain models, but also hopefully break the degen- 
eracy between different models (as our cosmological sim- 
ulations show below). This is particularly exciting con- 
sidering that high-quality observational data will keep 
coming in the next decades to improve our theoretical 
understanding. 

Numerical simulations arc done by numerical codes. 
In most cases the extra dynamical degree (s) of freedom 
is more accurately treated as a field and its value is re- 
quired on a number of space points. For this purpose we 
need the numerical code that can solve the field value on 
meshes covering (parts of) the simulation box. There are 
two such codes known to us to date: one is that of Oy- 
aizu [15], which has been applied to f{R) gravity [16, 17] 
and the DGP model [18]; the other is a modified MLAPM 
code [19] written by one of the authors [20] and which 
has been applied to chameleon theories [21], f{R) grav- 
ity [22], coupled scalar field theories [23, 24], scalar-tensor 
theory [25], dilaton model [26], symmetron model [27]. 
Both have shortcomings: the Oyaizu's code does not sup- 
port adaptive refinements (thus easier to implement) and 
high-resolution simulations arc not practical; while the 
MLAPM code does support adaptive refinements, it is not 
parallelised and not practical for simulations with very 
big volumes and high mass resolutions. Furthermore, the 
equations on MLAPM refinements are solved on a one-level 
grid, which is rather inefficient. 

The aim of this paper is to introduce a new code 
ECOSMOG, which overcomes the shortcomings of the pre- 
vious codes. The new code is based on RAMSES [28]. It 
is efficiently parallelised, supports adaptive refinements 
and solves the equations using multigrid method on the 
refinements (for a more detailed comparison of the three 
codes the readers can refer to the table I). As a work- 
ing example, we use the new code to run a number of 
test simulations for the f{R) gravity, which is one of the 
most challenging models to simulate because of the high 
nonlinearity of its equations. As will be shown below, the 
code works very well for the f{R) model, and we certainly 
expect it to be straightforward to implement equations 
in other models to our code. 



The organisation of this paper is as follows. In § II 
we briefiy introduce the f{R) gravity model. In § III we 
introduce the supercomoving code unit using a different 
form and list the TV-body Poisson and f{R) equations 
to simplify the numerics. § IV then makes discrete ver- 
sions of these equations to be implemented in our code. 
§ V, we show details on how the numerical implementa- 
tion is performed and discuss several important issues, 
which is the core section of this paper. Next, a long sec- 
tion, § VI, contains the results of eleven tests of the code. 
These tests check the correctness, efficiency and consis- 
tency of different aspects of the code and give us confi- 
dence about its reliability. Finally we compare the present 
code with other mesh-based codes, summarise and con- 
clude in § VII. 

This is a paper to explain the code and physical in- 
terpretations of the results will be presented in future 
publications. 



II. A TEST CASE: THE f{R) GRAVITY 

One can alter the Einstein gravity in such a way that 
it gives rise the cosmic acceleration without introducing 
dark energy. One example along this line is the so-called 
f{R) gravity, in which the Ricci scalar R in the Einstein- 
Hilbert action is generalised to a function of R (see e.g., 
[13] for a review and references therein). In f{R) gravity, 
the structure formation is governed by the following two 
equations, 

= -^a'SpM + y5i?(/fl,), (1) 
y^SfR = - y [SRifu) + SttGSpm] , (2) 

where <I> denotes the gravitational potential, fa = ^^^^ 
is the extra scalar degree of freedom, dubbed scalaron, 
Sfn = .fftiR) - fR{R),5R = R - R,SpM = Pm - Pm, 
and the quantities with overbar take the background val- 
ues. The symbol V is the three dimensional gradient 
operator, and a is the scale factor. These two coupled 
Poisson-likc equations are much difficult to solve than 
the single Poisson equation in General Relativity (GR), 
V^<I> = inGa'^SpM, which is linear. From Eqs (1) and (2), 
we can see that gravity in f{R) can be enhanced depend- 
ing on the local environment - in underdense regions, the 
(5i?(/fl) term in Eqs (1) vanishes thus the two equations 
decouple, making gravity simply enhanced by a factor of 
4/3. However in the dense region, 6 fa in Eq (2) is neg- 
ligible, yielding SR{fii) = —SnGSpM, which means that 
GR is locally restored. This is the chameleon mechanism 
applied in the f{R) gravity models, which is important 
for the cosmological viability of the latter. The presence 
of the chameleon effect indicates that Eqs (1) and (2) are 
highly nonlinear, making it challenging to numerically 
solve them in the simulation process. 
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An f{R) gravity model is fully specified by the func- 
tional form of f{R), and here we shall adopt the model 
proposed by Hu & Sawicki [29], which takes the form 

f{R) ^ -m p, ox„ , -, , (3) 



C2(-i?/m2)" + 1' 

where n, ci, C2 are model parameters, and 

= n^Hl, (4) 

with rim being the present fractional matter density and 
Hq the current Hubble expansion rate. 
It can then be shown that 

n(-i?/m2)"+i 



ci 



Because 



R « 87rGp,„ - 2f{R) = Sm^ 



2ci 

3 C2 



(5) 



(6) 



where overbars are used for the background quantities, to 
match a ACDM background expansion we have Ci/c2 = 
mK/^m- With ri,„ = 0.24 and Oa = 0.76, we find -R « 
Aim? 3> m? , which means that Jr can be approximated 
as 

2 -| n+1 



fR 



Cl 



-R 



(7) 



Because /ij is the actual quantity that enters the iV-body 
equations (see below) , we find that only the two parame- 
ters n and ^ = ci/cg. are needed in our simulations. An- 
other (independent) parameter which is the present 
background value of /jj, can be obtained from ^ and is 
often used to give people some idea about the size of fn. 



III. THE Af-BODY EQUATIONS 

In this section we shall introduce the convention of 
our code units, and list the A^-body Poisson and f{R) 
equations, the latter being written in the so-called quasi- 
static limit so that terms involving time derivatives will 
be dropped. The A^-body equations can all be found else- 
where, though perhaps of slightly different forms; conse- 
quently, this section shall be kept short and only serves 
for completeness. 



A. The Code Units 

The code unit used in the RAMSES code and its modi- 
fication developed here is based on (but not exactly) the 
supercomoving coordinates of [30] . It can be summarised 
as follows (where tilded quantities are expressed in the 
code unit): 



X 



(i?ifo)2' 



P ' 
di 



pa 

dt 
tin — . 



av 
c 





In the above x is the comoving coordinate, pc is the crit- 
ical density today, Vim the fractional energy density for 
matter today, v the particle velocity, </> the gravitational 
potential and c the speed of light. In addition, B is the 
size of the simulation box in unit of h^^Mpc and Hq the 
Hubble expansion rate today in unit of lOO/i km/s/Mpc. 
Note that with these conventions the average matter den- 
sity is p = 1 . All the newly-defined quantities are dimen- 
sionless. 

In the f{R) gravity equation we also have a new degree 
of freedom fn and in the code unit we will use fa = a? Jr 
instead. As we shall see below, this is to make sure that 
fn has an equivalent status as (/>: the latter is the Newto- 
nian potential and determines the total (modified) gravi- 
tational force, while the former is related to the potential 
of the extra force and determines the modification to the 
standard Einsteinian gravity. 



B. The Modified Poisson Equation 

From above we see that the modified Poisson equation 



IS 



V'0 = 

where 6R — R — R and 



167rG 



1 



5p - -6R 



fR 



R = 3mU a-' + 4 



(8) 

(9) 
(10) 



Using the code units defined above, the equation be- 
comes 



2ri,„a {p - 1) 



io 4 

6 



Jr 



3 a 



LI) 



C. The f{R) Equation 



The equation of motion for /j^. 



(12) 



becomes in the code units, 

fR 



1 

3S' 
1 



-^^mO. {p - 1) 



Jr 



3 a 



(13) 



BH, 



Comparing this equation with that for the modified Pois- 
son equation, we find that (f) and (? fa has the same code 
unit, and that is why we have used instead of Jr. 
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As in [22] , wc will not solve fa directly as it may change 
too quickly from one space point to another and can cause 
divergence problems in the numerical solutions. Instead, 
we will solve a new variable u, defined through /ij = — e". 
u will be of order unity across the space, and so has better 
convergence properties. 



IV. THE DISCRETISED EQUATIONS 

From here on we will only use variables expressed using 
the code unit, and the tildes will be dropped for simplic- 
ity. 

Clearly, to put the above equations into the A^-body 
code one must discretise them. For the Poisson equation, 
it is linear as along as the source term (the right-hand 
side) is provided, so the discretisation is fairly straight- 
forward: 



[4>i+l,j,k + 4>i-l.j,k + + 4'i,j~l,k + 0jj-,fc+l + 4>i,j,k-l — &4>i,j.k] 



2r2,„a ((5ij-,fc — 1) — -fi,„a^ 



{na "+i exp - 



^i,j.k 



-3a 



-3 



(14) 



where for example 4'i_j^k is the value of cj) in the grid cell with index (i, j, k). The discrete version of the nonlinear 

f{R) equation of motion, in contrast, is more tedious [22]: 



1 

7? 



i,feMi,i + l,fe - Uij^k (&ij+i,fe + feij-i,fc) + bi,j-^,kUi,]-l,k 

+ ^n^.a^ {na'O ^ exp (-^) - ^^^a (J,,,, ^ - 1) - ^n^,a^ L'^ + 4^ 



1 
1 



(15) 



Notice that in the above equations we have used 5i,j^k ^ 
Pi,j,k to avoid confusion in notations and to make it clear 
that we are using the density contrast. 6 = e" and 

bi+^j^k = 2 i^i+i,j.k + bi,j,k) , 



V. THE iV-BODY ALGORITHMS 

The detailed implementation of the A-body solvers in 
the RAMSES code is indeed quite different from that in 
the MLAPM code [20, 23]. As a result, certain corrections 
need to be made to the above discrete equations before 
implementing them. 

Both codes adaptivcly refine the grid in high-density 
regions and solve the Poisson and f{R) equations on the 
refinement to get better spatial resolutions. The refining 
is performed on a grid-by-grid basis so that the final re- 
finements typically have irregular shapes roughly match- 
ing the iso-density surfaces. Particles in regions underly- 



ing the refinements are linked to the latter, where their 
positions and momenta are updated. 

An important difference between the two codes is how 
the equations are solved on the refinements. MLAPM solves 
the equations on single level of the considered refinement 
only, performing Gauss-Seidel relaxations till the conver- 
gence of the solution is obtained. In RAMSES, however, for 
each adaptive mesh refinement (AMR) level there is a 
series of separate, coarser multigrid levels on which the 
Gauss-Seidel relaxation is used to accelerate the conver- 
gence. 

The treatments of boundary conditions on refinements 
are also different. In MLAPM, the outermost cells of a re- 
finement are taken as its physical boundary, thus the 
field values wherein arc set using interpolation from the 
coarser-levels solutions. In RAMSES, the use of multilevel 
requires the boundary conditions to be set consistently 
on the different levels, and this is achieved using an ele- 
gant mask scheme: on the AMR level, which is the finest 
level in the multigrid series, the active cells, inside which 
the field values need to be updated during the relaxation 
process, are given a mask value of 4-1 while the other 
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cells are assigned a mask value —1. The boundary of the 
computational domain is defined to be where the linearly- 
interpolated mask value vanishes, or equally the bound- 
ary of the active AMR cells. The value of the field on this 
physical boundary is computed at the beginning of the 
relaxation process and kept unchanged after that until a 
converged solution in the active cells are obtained. The 
mask values in the coarser multigrid cells are restricted 
from their finer-level values, while the physical bound- 
ary on this level is still defined as where the mask hits 
zero, which ensures the boundaries on different levels are 
consistent (for more details and tests in the classical GR 
case see [33]). 

Moreover, the treatments of the boundary conditions 
for linear and nonlinear elliptical PDEs are also different 
in RAMSES, as we shall see below. 



A. The Modified Poisson Equation 

1. The BC-modified Equation 

Unlike the MLAPM code, where the refinement boundary 
consists of the outermost cells on that refinement and so 
the boundary cells are always there, in RAMSES we might 
well face the situation that an outermost active cell has 
no neighbouring cells in some directions on the save level 
(possibly because a neighbouring coarse cell has not been 
refined), while we do need the field values in those cells to 
interpolate and compute the boundary conditions. Such 
cells are called ghost cells: they do not exist in the code 
data structure but we do need them. 

For the linear elliptical PDEs, such as the (modified) 
Poisson equation, there is a simple way to avoid storing 



information about the ghost cells. Consider the discrete 
Poisson equation and now suppose that the {i — 1, j, fc)- 
cell is a ghost, which has mask value rrii-i^j^i^ . We realise 
that the physical boundary, namely where the mask value 
crosses zero, will be a distance 



-h 



''ni,j,k — "^i-lj^fc 

from the {i — 1, j, k) (ghost!) cell and 

''^i;j,k ^ 

from the (i, j, fc)-cell, where h is the cell length, or equally 
the distance between the centres of these two cells. Using 
linear interpolation, the boundary value of (f> will then be 



which fiv«# "ii-i,j,/= 



i-l,j,k 



H-l.j,k 



Ij-l, j,k 



n,j,k, 



^i—l,j.k 
'^i^j.k 



H,j.k- 



(16) 



Note that though the [i — 1, j, fc)-cell is a ghost cell, the 
value of 4>i-ij^k can be computed, at the beginning of the 
relaxation iterations, from its father (coarser) cell which 
does exist. Then 0h is computed and fixed (because it is 
the boundary value!). During the subsequent relaxation 
iterations, the value of (j)i,j.k changes and so does that of 
4>i-i.,j,k but not (/){,. As we don't have the {i — 1, j, /c)-cell, 
there is nowhere to store the updated values of (pi-i^^u, 
and so we choose to not use it at all, replacing (jji-ij^k in 
the discrete equation by (jib and (fiij^k using Eq. (16). We 
then get a boundary-condition (BC)-modified equation 



1 

ft2 



h+l,j,k + 4>i,j + l,k + (t>i,j-lM + 4>i,j,k+l + (t>i,jM-l — I 6 — 



2il„a {S^j^k - 1) - g^^mfl 



(na^e)^ cxpf-^^ 



^i,j,k 



t>i,j,k 



/l2 



rrij-ij.k 



(Pb- (17) 



Note that because 4n, is fixed, we could simply move it 
to the right-hand side of the equation. The BC-modified 
right-hand side is then only needed to be computed once, 
before entering the relaxation iterations. There is no need 
to store either or Note that the left-hand side 

is also modified and one must be careful here. 



^ If this is a gliost cell then its mask value will be — 1 but here we 
try to be general in our description. 



2. The Multigrid Implementation 

Now consider the multigrid algorithm. For simplicity 
let us rewrite the above BC-modified equation as 



L 



hih 



(18) 



in which the superscript means we are on the level with 
cell size h. Note that L'* is a linear operator, which is 
important here. 

Suppose after a number of Gauss-Seidel iterations (pre- 
smoothing) , we get an approximate solution 0'' , then 



- h Ih 



(19) 
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in the active cells, and the difference 

d'' = - f' (20) 

is called the residual. We could define 5(f>^ = (p^ — 0'' and 
rewrite the equation as 

L^5(j)^ = -d^ (21) 

and coarsify this equation as 

L"5(t)" = -Tld^ (22) 

and solve it on the coarser level to accelerate the conver- 
gence. Here, the superscript ^ indicates that we are on 

I 

1 



assuming that the (i — 1, j, k) coarser-level cell is a ghost 
cell. Note that the right-hand side of the coarsified equa- 
tion is not BC modified since ScfiB =0, but the left-hand 
side is modified by the lack of neighbouring cells. 

Here we have only described the algorithm for two lev- 
els, but the generalisation to more levels is trivial. 

Finally some comments should be made on the restric- 
tion operation TZd^ : if a given fine cell is masked^ then it 
has no contribution to the coarser-level residual, because 
the field value in this cell is unmodified by the relaxation 
and so considered to be exact. At the same time, we shall 
not restrict residuals into coarse cells which are masked, 
because this is unnecessary. 

3. The Prolongation 

Once an approximate coarser-level solution to 5(j)^ is 

obtained, say ocj) , one can correct the fine-level solution 
using 

4>'' ^ 4>'' +VS(f>" , (24) 

where V is the prolongation operator, to find (expected) 
more accurate solutions on the fine level. 

In the prolongation process, a fine cell receives contri- 
bution from its eight neighbouring coarser cells. Of these 
eight coarse cells: 

1. one contains the fine cell and is given a weight of 
27/64, 

2. three have one common face with the fine cell and 
are given a weight of 9/64 each. 



^ Here "masked" means having a non-positive mask value. 



the coarser level on which the cell size is H = 2h, and 72. 
is the restriction operator. 



To solve this coarsified equation, we need the boundary 
conditions for S(j)^ on the coarser level and, as in Eq. (17), 
the coarser-level equation will be modified to include cor- 
rection terms involving S(f>B, which is the boundary 5(j) on 
the coarser level. However, on the boundary S(f> vanishes 
identically and it turns out that the coarser-level version 
of Eq. (17) is simpler: 



= -TZd^, (23) 

I 

3. three have one common edge with the fine cell and 
are given a weight of 3/64 each, 

4. one has a common vertex with the fine cell and is 
given a weight of 1/64. 

Of course, these weights sum to unity. 

Note that if a given fine cell is masked, then it is out- 
side the active computational domain and corrections are 
not necessary for it. If the coarse cell is not a valid multi- 
grid cell (it could be a valid AMR cell, though), then its 
contribution to the fine-cell correction is zero. 



B. The f(R) Equation 

Having recalled the multigrid algorithm for the linear 
(modified) Poisson equation, first presented in [33], we 
can now have a look at the nonlinear f{R) equation. The 
nonlincarity introduces certain complications compared 
to the linear case, which should be taken care of in the 
numerical implementations. 



1. The BC-Modified Equation 

As in the linear case, let us suppose that the k)- 
cell on the finest level is a ghost cell. Then 

Mi-lj,fc = M6 "6 H ^fjJ-.A:- (25) 

with the boundary value of u. Note that this necesar- 
illy means that the equahty 

- 6b - ^ '^H b + (26) 



-.H 



- 6- 
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does not hold, because 5 = e'' is nonlinear in u - this tain Uf, before entering the relaxation iterations. In the 

point is very important in order to make consistent BC- subsequent iterations, Uij_k is updated and so is 

modified f{R) equation. Instead, we shall use bi-ij^k = but not ui,. 

exp(ui_i jj;) with Ui-ij^k given by Eq. (25). Removing the Ui-ij^k and bi-ij^k in Eq. (15) using the 

above two relations, we arrive at the BC modified f{R) 

As before, Ui-ij_k is computed and then used to ob- equation: 

I 



C2 



2/l2 

2/12"''^''= 
52 



bi+i,j,k + + bi,j-i,k + bij^k+i + bij^k-i - 



Ui.j^k 



1 - 



+ -^rr^ \bi+i,j.kUi+i.j^k + &i,j+i,fcWi,j+i,fe + + 6i,j,/£+itiij-,fe+i + bij,k-iUij^k-i 



2h^ 



rritj^k 



3 V n + 1 



4 „-3 



— ^mO'{Pi,j,k — 1) 



0. 



(27) 



Now we could see clearly where the additional complex- 
ity appears. Remember that in the linear case we don't 
have to store (j)b because it only appears on the BC- 
modified right-hand side of the equation. Here, on the 
other hand, also appears in the term Ubbtj^k such that 



its value is needed in each iteration during which bi_j^k 
is updated. This implies that, for each outermost active 
cell, we should store Ub for its neighbouring ghost cell(s). 

If we define the the left-hand side of Eq. (27) as for 
simplicity, then it can be easily shown that 



dC^iu.^j^k) 



duij.k 



2/i2 



^i,j,k 
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(28) 



Note that when rrii^ij^k = and Ub = Ui^ij_k, the above 
equations reduce to those for a regular grid with periodic 
boundary conditions as expected. The relaxation on this 
level is then done through 



/i,ncw 



h,old 



Q /i.old 



(29) 



2. The MuHigrid Implementation 



Let us write the BC-modified f{R) equation as 

r (30) 



/ _ fh 



on the fine level, where C is the nonlinear operator acting 
on m'' defined above. Note that here = 0, but we shall 
still keep it for a reason which will become clear below. 

After a number of pre-smoothing iterations, one finds 
an approximate solution u'* on the fine level, which gives 



(31) 
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Consider the new equation 



-c 



= /''-/'' = -d'*. (32) 



After coarsifying and rearranging, we obtain the coarser 
level equation as 



C 



(33) 



After a number of relaxation iterations on the coarser 
level, we find an approximate solution u^, and the fine- 
level solution can be corrected as 



- /t .ncw 



'h.oXd 



(34) 



where V is the prolongation operator. 

Different from the linear case, here on the coarser level 
we are not solving an equation for the correction to the 
field u (remember in that case we solved 5(j) on the coarser 
level) but again u itself. Therefore Eqs. (27, 28) could be 
applied to the coarse level as well, if we 

1. change h to H , and 

2. find correct coarser- level boundary values for u, ub- 

The first point is fairly straightforward, while the second 
needs some further analysis. One must first find the phys- 
ical boundary on the coarser level which, as explained 
above, is where the coarser-level mask crosses zero. 
This is easy because is computed by restricting the 
fine-cell mask values. The values of ub in the boundary 
coarse cells are taken as those in the corresponding AMR 
cells. 

Finally some comments should be made on the restric- 
tion operations TZd^ and TZu'^. TZd''^ will be treated simi- 
larly as in the linear case for the same reasons explained 
there. 

For TZii'^, even though a given fine cell is masked it still 
has contribution, because otherwise the computed TZu'^ 
for the coarser cell will be incorrect. But if a coarser cell 
is masked we will not compute TZu^ for it, since this will 
be unused anyway, as in the case of TZd'^ . 

The truncation error can be estimated as 



c" {nil'') - nc'' 



(35) 



and similarly for other levels. This could provide a stop- 
ping (convergence) criterion for the multigrid iteration, 
which in our case is 

\d''\<a\T''\, (36) 
where a < 1/3 is a predefined constant. 

3. The Prolongation 

As mentioned above, in the nonlinear case we have to 
prolongate the quantity {u^ —TZu'^), but not the residual 
itself. The prolongated result can then be used to correct 
the fine-level solutions. 

The prolongation here is done in the same way as that 
for the residual of the linear (modified) Poisson equation, 
to be consistent. Details shall not be presented here. 



VI. CODE TESTS 

In this section we shall give the results of some tests 
we have performed to show that the above algorithm and 
implementation work correctly and efficiently. 

A. f{R) Equation Solver On Domain Grid 

The most important part in the modified RAMSES code, 
which is the topic of the last two sections, is the equa- 
tion of motion for f{R) gravity or the new degree of free- 
dom(s) in other theories. Here we have performed a range 
of tests for it with different configurations of the matter 
density field. 

1. Homogeneous Matter Density Field 

In a universe with homogeneous density, we know that 
the quantity fn is homogeneous and it exactly takes its 
background value fa, namely 



0„ 



n+1 ' 



(37) 



everywhere. Therefore, as the simplest test of the f{R) 
equation solver, one has to show that in such a homoge- 
neous field, given some random guess of fa on the cells of 
the simulation mesh, after a reasonable number of Gauss- 
Seidel relaxation sweeps, the solution converges to the 
above background value. Such simple test have been used 
previously in [26, 27] to show that the MLAPM solver for 
extra degrees of freedom works well. 

We have performed this test for a sa 0.04, n = 1 and 
a value of ^ corresponding to |/flo| = 10~^. The result is 
shown in Fig. 1, where we plot the values of m = log(— /i?) 
in the cells in x-direction, before and after the Gauss- 
Seidel relaxation, for two different random initial guesses. 
We can see that the final solution agrees with the ana- 
lytical result (the horizontal line) very well (see figure 
caption for more details). 



2. Point Mass 

As a second test of the RAMSES f{R) equation solver, 
let us consider the solution of around a point mass at 
the origin, for which case we have an analytical solution 
which is accurate except for the regions very close to the 
mass. Such a test has been used previously in [15, 26]. 

Following [15], we construct the point- mass density 
field as 



10-4 
-10- 



{N^ - 1) , i=j = k^O; 
4, otherwise. 



(38) 



in which i,j,k are respectively the cell indices in x,y,z 
direction. In the test we have used a cubic box with size 
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FIG. 1. (Colour online) Test of the solver for the f{R) equa- 
tion in a constant matter density field. Only results in the cells 
along the s-axis are shown, and the s-coordinate is rescaled 
by the size of the simulation box so that x £ [0, 1]. Two ini- 
tial guesses of u = log(— /a) have been tried (the red squares 
and blue triangles), the final answer corresponding to which 
are respectively denoted by green filled circles and purple di- 
amonds. The black horizontal line is the exact solution. Note 
that the results with the first initial guess (filled squares and 
circles) have been shifted rightwards to make the plot clearer. 



256/i~^Mpc and 128 grid cells in each direction. The 
other physical parameters are chosen as a = 1 , n = 1 
and we have used three values of ^ corresponding to 

i/flol = lo-Mo-^lo-4. 

Meanwhile, the analytical solution can be obtained ap- 
proximately by solving the equation 



(39) 



in which the effective mass of the scalar degree of freedom 
S/r, TOeff , is given by 



1 



1 



Weff 



V3n(n + l)e 



3 1 



n+2 



r!™7?2.(40) 



Fig. 2 shows the comparison between the numerical 
solutions to Sffi along the cc-axis (symbols) and analytical 
solutions (solid curves) , and we can see that the two agree 
very well for all three models used in this test. 



FIG. 2. (Colour online) The solution to S/r = fn — fR around 
a pointed mass constructed according to Eq. (38), for three 
models with |/ijo| = 10^® (blue triangles), 10~^ (green filled 
circles) and 10~* (red squares) respectively. The solid curves 
are the corresponding analytical approximations which are 
accurately far from the point mass. Only the solutions along 
the a;-axis are shown. 



count for the code units) is given by 



5{x) 



-i2n) 



3 (a-3 + 4g^ 



J sin(27ra;) 



— sin(27ra;)] 



— _ 1 



}(41) 



in which x is rescaled such that x E [0, 1]. Notice that we 
consider only a;-dependence, which is equivalent to a one- 
dimensional configuration. The solution to this density 
field can be analytically worked out to be. 



3 (a-3 -t-4g^ 



n+ 



J [sin(27rx) - 2] . (42) 



Fig. 3 shows the test results for the sine density field 
given above, with a = n = I and three values of ^ corre- 
sponding to l/flol = 10^^, 10"^, 10""* respectively. It can 
be seen that the numerical solutions (symbols) agree with 
the analytical solutions (solid curves) very well. 



4- Gaussian Density Field 

3. Sine Density Field 

The last test of the f{R) equation solver on the domain 
As our third test, let us consider the sine density field grid uses a Gaussian type density configuration. Again, 
introduced in [15], which (after some modification to ac- here we only consider one-dimcnsional case, where the 
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FIG. 3. (Colour online) Solutions of fn in a one-dimensional 
(a::-direction) sine density field constructed using Eq. (41), for 
three models with |/ijo| = lO^'' (blue triangles), 10^* (green 
filled circles) and 10~* (red squares) respectively. The solid 
curves are the corresponding analytical results. A simulation 
box with side length of 256/i~^Mpc and 256 grid cells on each 
side is used in the computation, x is in physical units. 



density field is specified by 



Six) 




where again x has been scaled to code units so that x e 
[0,1], W, a are numerical constants which respectively 
specify the width and height of the density field, which 
obviously peaks at x — 0.5, and A is defined by 



A 



,-3 



(44) 



Note that such a density field is not exactly periodic at 
the edges of the simulation box, but given that W is small 
enough, (5 — >■ at the box edges and periodic boundary 
conditions are approximately satisfied. 

The solution to ffi can then be obtained analytically 
and is 



1 — a exp 



0.5 



i2\ 1 



(45) 



which clearly shows that when a - 
very small at x = 0.5 while at x 
to its background value. 
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FIG. 4. (Colour online) Solutions of (r in a one-dimensional 
(a;-direction) Gaussian-type density configuration constructed 
using Eq. (43), for the models with [fm] = 10~^ and a — 0.5 
(red circles), 0.99 (green triangles) and 0.999 (blue squares). 
The solid curves are the corresponding analytical results from 
Eq. (45). A simulation box with side length of 256/i~^Mpc and 
256 grid cells on each side is used in the computation and the 
f{R) equation is only solved on the regular domain grid, x is 
in physical units. 



We have implemented Eq. (43) into our numerical code 
and the numerical solutions for fj^ are shown in Fig. 4. 
For simplicity we only show the results for |/flo| = 10~^, 
but for three different values of a = 0.5, 0.99 and 0.999 
(the open circles, open triangles and open squares in 
Fig. 4 respectively). We can see that the numerical results 
agree with the analytical solution Eq. (45) very well. 



B. f{R) Equation Solver On Refinements 

The above four tests show that our solver for the f{R) 
equation actually works accurately on the regular domain 
grid, but this is not sufficient for the code test since the 
f{R) equation is also solved on refinements where, as we 
have seen, the equations take different forms due the com- 
plex treatment of the boundary conditions. It is therefore 
essential to test the f{R) equation solver on the refine- 
ments as well, as we shall do in this subsection. 



1. Two-level Gaussian Density Field 

The Gaussian-type density configuration could pro- 
vide a good check of the multilevel f{R) equation solver 
because the density peak can be made arbitrarily high 
by adjusting the parameter a. Near its peak, the den- 
sity field changes rapidly and a higher spatial resolu- 
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FIG. 5. (Colour online) The same as Fig. 4, but for the models 
with = 10"^ and a = 0.99,0.999,0.9999,0.99999 (from 
top to bottom: red, green, blue, magenta). The f{R) equation 
is solved on two levels: level 8 (the regular domain grid) and 
level 9 (the first refinement), and their numerical solutions are 
represented by filled squares and empty squares respectively. 
The solid curves are the corresponding analytical results from 
Eq. (45). 
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FIG. 6. (Colour online) Numerical errors of fn (defined as 
the fractional difference between the numerical result and the 
exact solution Eq. (45)) for the models with |/j;o| ~ 10~'' 
and a = 0.99 (upper left panel), 0.999 (upper right), 0.9999 
(lower left) and 0.99999 (lower right). The f{R) equation is 
solved on two levels: level 8 (the regular domain grid, red 
filled diamonds) and level 9 (the first refinement, blue hollow 
circles). 



tion is needed to compute accurately. Consider the 
case where the regular domain grid is refined only once, 
in regions where the density value exceeds some certain 
threshold (we shall call this a two-level problem, and in 
the present numeric example the coarse and fine levels are 
respectively levels 8 and 9). The density values in both 
the coarse and refined cells are given by Eq. (43), while 
the values of at the fine-level boundaries are com- 
puted from interpolation of those in the nearby coarse- 
level cells, as discussed above. 

Fig. 5 shows the numerical values of fn on both levels 
in the region covered by the refinement. We show the 
results for four different values of a (0.99, 0.999, 0.9999, 
0.99999 from top to bottom), and for each a the results 
from the coarse and fine levels are denoted respectively by 
filled squares and empty circles. For comparison we have 
also plotted the analytical results Eq. (45) as solid curves. 
As we can sec, both fine-level and coarse-level results arc 
virtually indistinguishable from the exact solution by eye. 

This does not mean that the refinement is unnecessary 
however, because, as shown in Fig. 5, the fine level has 
more data points and could probe regions closer to the ex- 
treme value of fa, which corresponds to the high-density 
region where high resolution is needed. 

To see more quantitatively how well the solutions from 
different levels agree with the exact results, we have plot- 
ted in Fig. 6 the percentage errors of the numerical solu- 
tions for the same four models considered in Fig. 5. The 
following features could be easily observed from this plot: 



1. At the coarse level (level 8), the numerical error in- 
creases as the density fluctuation becomes stronger 
(i.e., a — > 1), which is as expected; 

2. Around the peak of the density field, the fine level 
(level 9) consistently gives more accurate numerical 
results than level 8 for all 4 models, thanks to its 
higher spatial resolution; 

3. The improvement of level-9 result over that of level- 
8 is smaller as a — )■ 1, because at the peak of the 
density field even level 9 will not be very accurate; 

4. The levcl-9 result is actually less accurate at the re- 
finement boundaries, because of the numerical er- 
ror that comes from setting boundary conditions 
using interpolation. Furthermore, the error at re- 
finement boundaries is the same for all models, at 
^ 0.025%, which is negligible anyway. Note that we 
could improve on this using a higher-order interpo- 
lation scheme at boundaries, which will be consid- 
ered in future work. 



2. Three-level Gaussian Density Field 

In the above we have performed a test using the do- 
main grid and one refinement level, but the generalisation 
to more refinement levels is very straightforward (and in- 
deed done automatically in the code, with the refinement 
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FIG. 7. (Colour online) Upper Panel: The one-dimensional 
matter density field as a function of the x coordinate (in phys- 
ical units), to provide an impression about the height of its 
peak. 5m = Pm/pm — 1- Middle Panel: The same as Fig. 5, 
but for the model with |/ko| = 10"^ and a = 0.99999 only for 
simplicity. The f{R) equation is solved on three levels: level 
8 (the regular domain grid) and level 9 (the first refinement) 
and level 10 (the second refinement), and their numerical so- 
lutions are represented by red squares, green circles and blue 
triangles respectively. The black solid curve is the correspond- 
ing analytical result from Eq. (45). Lower Panel: The same 
as Fig. 6, but for the model studied here; the use of symbols 
is the same as in middle panel. 




kh/Mpc 

FIG. 8. (Colour online) Upper Panel: Comparison of the non- 
linear matter power spectra using the TSC (blue diamonds) 
and CIC (red squares) density assignment schemes to that ob- 
tained using Smith et al. fit (black solid curve). Lower Panel: 
The relative differences of the TSC and CIC power spectra 
with respect to that of the Smith et al. fit. The dashed black 
line is zero identically. The cosmological simulations here use 
a cubic box with size 100/i~^Mpc and 256 cells on each side 
of the domain grid. The error bars show the sample variance 
over 5 different realisations. All results are at redshift 0. 



in the spatial regions covered by them, and again we see 
that the agreements with the exact solution are very good 
(indistinguishable by eye) . The numerical errors from the 
f{R) equation solver at the three levels are shown in the 
lower panel, and we see the same patterns as observed in 
Fig. 6, namely near the density peak higher refinement 
level produces smaller error, but at the refinement edges 
slightly bigger error occurs because of the interpolation 
used in setting the boundary conditions. The magnitude 
of the latter error source, however, is much smaller than 
that near the density peak, and is negligible for cosmo- 
logical simulations anyway. 

These tests indicate that our f(R) equation solver is 
reasonably accurate and therefore reliable also on the re- 
finement (s), which is one of the most significant achieve- 
ments of the present code. 



criterion specified a priori). Here we only give an exam- 
ple with three refinement levels, again making use of the 
above Gaussian-type density field. 

Fig. 7 shows the results of the three-level test. In the 
upper panel we have plotted the density field 5m as func- 
tion of cc, and it could be seen that it has a rather high 
and sharp peak near x = 128/i~^Mpc, which is exactly 
where higher spatial resolution and thus refinements are 
needed. In the middle panel we have shown the numerical 
solutions from the three refinement levels (8, 9 and 10) 



C. Density Assignment and Force Interpolation 

The default RAMSES code uses the CIC (cloud-in-cell) 
scheme to do density assignment and force interpola- 
tion. While the CIC scheme works well and is the most 
widely used due to a compromise of accuracy and cost, 
the TSC (triangle-shaped clouds) scheme actually pro- 
duces a smoother density field which more resembles the 
true dark matter distribution, especially when the parti- 
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TABLE I. A brief summary of the key features of the three known grid-based codes for A^-body simulations in modified gravity 
and/or dynamical dark energy theories. Here OC stands for Oyaizu Code, and by MLAPM we mean the modified version to include 
the multigrid solver for the extra degree(s) of freedom. 



Codes 


OC 


MLAPM 


ECOSMOG 


Reference 


Ref. [15] 


Refs. [20, 23] 


This paper 


Density assignment scheme 


CIC 


TSC 


TSC 


Force interpolation scheme 


CIC 


TSC 


TSC 


Parallelisation 


OpenMP 


N/A 


MPI 


Adaptive refinement? 


No 


Yes 


Yes 


Multigrid on regular grid? 


Yes 


Yes 


Yes 


Multigrid relaxation arrangement on regular grid 


V-cycle 


V-cycle/adaptive cycle 


V-cycle 


Multigrid on refinement? 


N/A 


No 


Yes 


Multigrid relaxation arrangement on refinement 


N/A 


N/A 


V-cycle 


Programming language 


C++ 


C 


FORTRAN 90 



cle number is not large enough. Furthermore, this results 
in a more isotropic foree around a point mass, a desirable 
property. For this reason, we have added a new routine in 
the code to do the TSC density assignment. Correspond- 
ingly, the force interpolation also needs to use the TSC 
scheme to ensure momentum conservation. For more dis- 
cussions on the different density assignment schemes we 
refer the readers to [31]. 

Because in the TSC density assignment scheme a par- 
ticle's cloud ^ spreads more than in the CIC scheme, the 
resulted density field will be smoother and as a result 
we expect less small-scale structure from iV-body simu- 
lations. This is confirmed by our numerical results shown 
in Fig. 8, where we have plotted the nonlinear matter 
power spectra from our ACDM runs using TSC and CIC 
respectively, and compared them with the Smith et al. fit 
[32]. We can see that on very large scales (small k) TSC 
gives virtually the same prediction as CIC, but on smaller 
scales (large k) it produces less power than the latter. On 
the other hand, the difference between TSC and CIC (a 
few percent) is much smaller than their difference from 
the Smith et al. fit (up to 40% on small scales). 

A simpler test of the code segments for the TSC density 
assignment is to check that the average density of all grid 
cells on the domain grid is 1.0. We have monitored this 
at each time step of all our cosmological simulations and 
found that this is indeed the case. As a result, we believe 
that this part of our ECOSMOG code is reliable. 



D. Performance Tests 

In addition to a more accurate f{R) equation solver, 
another aim we want to achieve for our code is efficiency. 
The efficiency in the f{R) equation solver can be greatly 
improved by two factors. The first (as mentioned earlier) 
is to use multigrid rather than a single grid in arranging 
the Gauss-Seidel relaxation to speed up its convergence, 
and the other is using parallelisation to take advantage 
of the supercomputing resources. These two elements are 
incorporated in the original RAMSES code for the Pois- 
son solver, and in our code we apply them to the f(R) 
equation solver as well. 



1. Performance of V-cycle 

We use a V-cycle to arrange the multigrid relaxation, 
as in the modified MLAPM code [20, 23]. As a check of 
the improvement that is achieved in this way, we have 
plotted in Fig. 9 the convergence rates for three mod- 
els (details of which can be found in the figure caption) 
using single-grid relaxation (solid curves) and multigrid 
V-cycle (dashed curves). The convergence rate is equiv- 
alent to the rate at which the module of the residual 
decreases. We can see that with the multigrid relaxation 
the residual drops below 10""'^^ after tens of Gauss-Seidel 
sweeps on the finest grid, while with the single-grid re- 
laxation this depends sensitively on both the model and 
its parameters - in the worst case we have tested here 
(the sine density field with fn^ = —10"'', solid curve 
with filled squares in Fig. 9) the residual is still bigger 
than 10^** even after 1000 Gauss-Seidel sweeps!'^ Clearly 



In density assignment, a particle only contributes to the density 
field within a finite region around it, and this is called that parti- 
cle's cloud. The particle's contribution to density field integrated 
over its cloud is its mass. As a result, the bigger the cloud is, the 
smoother the resulted density field will be. 



* Of course this also depends on the initial guess - if it is close 
enough to the true solution then rapid convergence could be ex- 
pected. But in most situations we have no idea about the exact 
solution and cannot find such good initial guesses. 
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FIG. 9. (Colour online) The reduction of the residual for the 
f{R) equation for three models: |/flo| = 10^'' with a sine-type 
density field as discussed above (red squares), |/i?o| = 10~^ 
with a sine-type density field (green circles) and Ifm] = 10"* 
for a point mass discussed above (blue triangles). The results 
using a single grid are represented by symbols connected by 
solid curves, while those using the multigrid method (arranged 
by V-cycles) by symbols connected by dashed curves. 

the single-grid relaxation is impractical in most realistic 
simulations. 

One difference between the code here and the modi- 
fied MLAPM [20, 23] is that here the multigrid relaxation 
method is used not only on the regular domain grid, but 
also on the refinements. In contrast, in MLAPM single-grid 
relaxation is used on refinements, which gains ease in the 
code implementation but at the cost of computing time 
and accuracy. Tests of our code show that on the refine- 
ments the V-cycle could bring the residual down to 10~^^ 
within a reasonable number of Gauss-Seidel sweeps, just 
as it does on the regular grid. Therefore, we expect more 
accurate results than those obtained in previous works; 
we will come back to this in a future work. 



2. ParalleUsation Performance 

Parallelisation is a way to let many CPUs share the 
computational overhead and therefore reduce the total 
physical time needed to finish the job. It has greatly im- 
proved the efficiency of numerical simulations in cosmol- 
ogy. Our parallelisation here uses the structure of the 
original RAMSES code, which is parallelised using MPI and 
has been shown to work very well for the Poisson equation 
solver. This is made possible by the fact that the mesh 
structures used for the Poisson and f{R) equations are 
exactly the same, and so the structures for communica- 
tion (which is a crucial ingredient of parallelisation) could 



FIG. 10. (Colour online) The wallclock time of the f[R) equa- 
tion solver as a function of the number of CPUs used in the 
parallelised computation. The actual measurement is shown 
as the black solid curve, while the blue dashed line is a linear 
scaling just for comparison. It is obvious that the time-CPU 
number scaling is approximately linear for CPU numbers up 
to ~ 100. 

be shared by the Poisson and f[R) equation solvers, pro- 
vided that they will not be modified until both equations 
are solved and relevant quantities appropriately stored. 

To check the efficiency of the parallelisation, we have 
used different numbers of CPUs to solve the f{R) equa- 
tion (only once) and recorded the wallclock times used. 
This is plotted in Fig. 10 from which we can see that, at 
least when the CPU number ranges between 1 and ^ 100, 
the wallclock time roughly scales linearly with the num- 
ber of CPUs. This shows that the parallelisation does 
help to make the code faster, and therefore most suitable 
for future simulations with large numbers of particles, 
large box sizes and high spatial resolutions. 



E. Cosmological Simulations 

The aim of the modified code is to run cosmological 
simulations for modified gravity and dynamical dark en- 
ergy theories, f{R) gravity as an example. Therefore, we 
still need to test its reliability in such simulations, which 
is the purpose of this subsection. 

As one cosmological test, we have performed simula- 
tions for the f{R) gravity model with l/ijol = 10~^ using 
a 100/i~^Mpc simulation box and measured their matter 
power spectra. In order to reduce the cosmic variance, we 
show the matter spectrum for f{R) gravity rescaled by 
that for a ACDM model simulated using the same initial 
condition. To reduce the sample variance, we make bins 
uniform in logarithm of the wavenumber k. 
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FIG. 11. (Colour online) Nonlinear power spectra at redshifts 
1 (black filled squares) and (red filled circles), from cosmo- 
logical simulations for the f(R) model with |/flo| ~ 10~^. The 
horizontal axis is the wave number in the unit of /iMpc~^, and 
the vertical axis is the fractional difference between the f{R) 
and ACDM results. We have considered five realisations for 
each model, using the same set of initial conditions, box size 
(100/i~^Mpc), spatial resolution (256 cells on each side for the 
domain grid) and refinement criterion. The power spectra are 
averaged and the standard deviations are shown as error bars. 
We have also binned the horizontal axis to smooth the result 
[22] . The corresponding results from [22] have been plotted as 
black empty squares and red empty circles for comparison. 

The numerical results arc summarised in Fig. 11 (more 
teehnical details are described in the caption). To make 
comparison with the ACDM result, displayed in Fig. 11 
are the relative differences between the f{R) and ACDM 
power spectra. At redshift z = 1 {a = 0.5), the fractional 
difference monotonically increases towards smaller scales, 
which have already been within the reach of the fifth force 
and started experiencing the boosted matter clustering; 
at redshift (a = 1.0) the fractional difference goes down 
at smaller scales, because the particles have been accel- 
erated, which helps to erase the small-scale structure to 
a certain degree [22]. 

We have also compared the results from ECOSMOG and 
those from the modified MLAPM code (empty symbols) in 
Fig. 11, which shows that the two codes agree quite well 
on large scales. On smaller scales, the quantitative agree- 
ment becomes worse but the qualitative features are still 
the same. In particular, the results at z — 0.0 are within 
the la error bars of each other. The discrepancy on small 
scales comes from the fact that the ECOSMOG code could 
achieve much higher accuracy than MLAPM: on the domain 
grid (with 256 cells in our ECOSMOG runs and 128 cells in 
the MLAPM runs) the convergence criterion is \d^\ < 10^^^ 
for the former code and \d^\ < max (10~^, 0.03|r''|) for 
the latter; on the refinements it becomes | < 10~^ and 



FIG. 12. (Colour online) Nonlinear matter power spectra at 
redshift 0, from cosmological simulations for the f{R) models 
with |/i?o| = 10~* (red empty squares), 10~^ (green empty 
circles) and 10~^ (blue empty triangles). The horizontal axis 
is the wave number in the unit of hMpc~^ , and the vertical 
axis is the fractional difference between the f{R) and ACDM 
results. The solid curves are the corresponding results from 
Smith et al. fit. 

jd'^l < max(l0-^0.03|r''|) for the two codes^ 

Because <C 1 appears in the denominator of the 
right-hand side of the Poisson equation, a small difference 
in fn can be magnified and result in a big difference in 
the solution of the Newtonian potential. This is why a 
high accuracy in the solution of fu is important, and can 
also explain why the difference between the ECOSMOG and 
MLAPM results in Fig. 11 is bigger at earlier times (when 
Iful is smaller and its accuracy more important). Indeed, 
we find the small-scale structure in the f{R) gravity to 
be quite sensitive to the convergence criterion of the f{R) 
equation solver as well as the size of the simulation box, 
the refinement criterion etc., and a thorough study needs 
to be conducted to clarify these issues before systematic 
simulations of the f{R) gravity are performed. We leave 
this as a future work. 

As the other cosmological test, we have also run simu- 
lations for three /(i?) gravity models with \fRo\ = 10~^, 
10~^ and 10~^ respectively, for a bigger simulation box 
(500/i^^Mpc now). Because the measured matter power 



^ We note that on refinements it is almost always the case that 
10~^ <S 0.03|t''|, because the truncation error can be quite big 
on the fine levels. This moans that the MLAPM f{R) equation solver 
was deemed as having convorg od even when ~ 0(0.01 - 1) 
on the higher refinements, while the ECOSMOG solver has a much 
more stringent criterion of \d^\ < 10^*, and naturally will give 
more accurate results. Without using multigird relaxation on the 
refinements, such an accuracy will not become practical. 
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spectra are much smoother than the case above, here we 
shaU not average over reahsations or perform the binning. 
Fig. 12 shows the enhancements of the power spectra in 
these models (symbols), and also the results using Smith 
et al. fit (dashed curves). We could see that on the large 
scales they agree reasonably (for k <C 1/iMpc^^ the dif- 
ference is at most a few percent). This serves as another 
check of the consistency and reliability of our code. 

As a final note, we find that simulations using bigger 
boxes generally takes less time to complete, provided that 
the particle numbers are the same. This is because the 
clustering is less developed and the cell size is larger, so 
that smaller scales are not resolved as well as for smaller 
box sizes. As a consequence, time steps are larger, and 
there are few refinement levels, even with the same re- 
finement criterion. The density field is smoother, which 
makes the /(i?) equation solver converge more quickly. 
With 80 CPUs, our simulations with 500/i~^Mpc box size 
can finish within a couple of hours, in contrast to about 
24 hours for the simulations with 100/i^^Mpc box size. 

VII. SUMMARY AND CONCLUSIONS 

Large A^-body simulations for modified gravity or dy- 
namical dark energy theories have become more and more 
important in the wake of the inflow of high-quality ob- 
servational data in the near future. To best extract in- 
formation from these data, one needs better theoretical 
understandings of different models, in particular on scales 
of galaxy cluster and smaller sizes, where the commonly- 
used linear perturbation analyses arc no longer valid. 

In this work wc have presented a new. efficiently par- 
allelised, accurate and fast code called ECOSMOG based on 
the public code RAMSES. The ECOSMOG code is specifically 
designed for performing large A^-body and hydrodynamic 
simulations for modified gravity and generic dark energy 
theories with additional scalar degree(s) of freedom. We 
have clarified a number of numerical and technical issues 
about implementing the (usually nonlinear) equations of 



motion for those new degrees of freedom, and preformed 
a series of code tests, using f{R) models, for the issues 
of the accuracy of the multigrid scalar field solver on the 
domain grid, on the refinements, the density assignment 
and the force interpolation scheme, as well as the efficien- 
cies of the scalar field solver and of the parallelisation. 
Our results show that ECOSMOG works extremely well. 

ECOSMOG closely follows the convention and style of the 
original RAMSES code. Special efforts have been made to 
not mess the default code, and the scalar field solver has 
been designed as several additional F90 files which are 
maximally parallel to the RAMSES Poisson solver. There 
are some modifications to the default code, mainly to en- 
sure smooth interfaces (such as make sure useful quan- 
tities are not destroyed between the calls to the scalar 
field and Poisson solvers), but these have been kept to 
minimal level. 

ECOSMOG is easy to use and modify for other gravity or 
dark energy models, making it a potentially powerful tool 
for massive A^-body and hydro dynamical simulations for 
interesting theories of gravity and dynamical dark energy. 
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